Author(s): Zongcheng Li; Yasi Zhang
Reviewer(s): Ying Ge
Date: r Sys.Date()

Academic Citation

If you use this code in your work or research, we kindly request that you cite our publication:

Xiaofan Lu, et al. (2025). FigureYa: A Standardized Visualization Framework for Enhancing Biomedical Data Interpretation and Research Efficiency. iMetaMed. https://doi.org/10.1002/imm3.70005

Requirement description

Draw a link diagram like this one.

Source: https://molecular-cancer.biomedcentral.com/articles/10.1186/s12943-021-01322-w, from the same article as FigureYa260CNV.

Fig. 5 Transcriptional and post-transcriptional characteristics associated with the WM_Score. a Differences in miRNA-targeted signaling pathways in the TCGA-COAD/READ cohort between the WM_Score-high and -low groups. The red line represents a low expression of miRNA in the high WM_Score group, and the blue line represents a high expression of miRNA in the low WM_Score group. Red dots correspond to miRNA-targeted genes highly expressed in the high WM_Score group, and blue dots correspond to miRNA-targeted genes highly expressed in the low WM_Score group. The circle represents a signaling pathway enriched with targeted genes.

A similar figure:

Source: https://doi.org/10.1038/s42255-019-0045-8, from the same article as FigureYa174squareCross, FigureYa199crosslink, and FigureYa256panelLink. This paper is well known for its link diagrams, which are often imitated; whether they will be surpassed remains to be seen.

Fig. 3 | Overview of the propensity score algorithm and the hypoxia-associated molecular patterns across cancer types. c, Association between mRNA expression levels of hypoxia-associated genes and drug sensitivity across 1,074 cancer cell lines by Spearman’s rank correlation. The dark green dots along the x axis indicate hypoxia-related genes; the orange dots denote drugs that are clustered by different signalling pathways. The size of the orange dot indicates the number of genes correlated with drug sensitivity (|rs| > 0.3, FDR < 0.05); the bar plot shows the number of drugs correlated with the genes. The pink and cyan lines indicate positive and negative correlation, respectively. JNK, Jun N-terminal kinase.

Application scenarios

This figure shows miRNA-target gene relationships (or gene-drug relationships, and so on). The colors of the lines and nodes encode the node type (for example, high vs. low WM_Score in the example paper). Genes in the same pathway are drawn inside one circle, which is labeled with the pathway name.

To draw this figure, we extended the crosslink R package, which will continue to gain new linking features. See https://github.com/zzwch/crosslink for the latest version and functionality; you can also open an issue on GitHub to reach the author directly.

Environment Setup

source("install_dependencies.R")
## Starting R package installation...
## ===========================================
## 
## Installing CRAN packages...
## Package already installed: ggplot2 
## Package already installed: magrittr 
## Package already installed: tidyverse 
## Package already installed: rlang 
## Package already installed: patchwork 
## 
## ===========================================
## Package installation completed!
## You can now run your R scripts in this directory.
source("crosslink.R") # From R package crosslink
source("layout.R") # From R package crosslink
source("transfromation.R") # From R package crosslink
source("utils.R") # From R package crosslink

library(magrittr)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.2
## ✔ ggplot2   4.0.0     ✔ tibble    3.3.0
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.1
## ✔ purrr     1.1.0     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ tidyr::extract()   masks magrittr::extract()
## ✖ dplyr::filter()    masks stats::filter()
## ✖ dplyr::lag()       masks stats::lag()
## ✖ purrr::set_names() masks magrittr::set_names()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(ggplot2)
library(rlang)
## 
## Attaching package: 'rlang'
## 
## The following objects are masked from 'package:purrr':
## 
##     %@%, flatten, flatten_chr, flatten_dbl, flatten_int, flatten_lgl,
##     flatten_raw, invoke, splice
## 
## The following object is masked from 'package:magrittr':
## 
##     set_names
library(patchwork)

# Show error messages in English
Sys.setenv(LANGUAGE = "en")

# Prevent character-to-factor conversion
options(stringsAsFactors = FALSE)

Input Files

easy_input_links.csv: each row is a link between a source (miRNA, column src) and a target (target gene, column tar). The line color encodes the type of the source, stored in source_type (high or low WM_Score).

easy_input_nodes.csv: the pathway (path) that each key belongs to; key covers both sources and targets.
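As a rough guide to the expected layout, the sketch below builds two tiny tables with the column names used in the example data (src, tar, source_type for links; key, type, path, signif for nodes) and checks that every link endpoint appears among the node keys. The values are invented for illustration, and the check is an assumption about what a consistent input should satisfy, not something crosslink is confirmed to enforce.

```r
# Hypothetical miniature versions of the two input tables
links_demo <- data.frame(
  src = c("miR-1", "miR-1", "miR-2"),
  tar = c("geneA", "geneB", "geneB"),
  source_type = c("src_up", "src_up", "src_dn")
)
nodes_demo <- data.frame(
  key = c("miR-1", "miR-2", "geneA", "geneB"),
  type = c("src_up", "src_dn", "tar_up", "tar_dn"),
  path = c("source", "source", "path1", "path2"),
  signif = c("*", "ns", NA, NA)
)

# Every link endpoint should be listed as a key in the nodes table
missing_keys <- setdiff(c(links_demo$src, links_demo$tar), nodes_demo$key)
if (length(missing_keys) > 0) {
  stop("Keys missing from nodes: ", paste(missing_keys, collapse = ", "))
}
```

Running the same setdiff() on your own links and nodes tables before plotting is a cheap way to catch misspelled gene or miRNA names.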

links <- read.csv("easy_input_links.csv")
nodes <- read.csv("easy_input_nodes.csv")

# Get all pathway names
paths <- unique(nodes[nodes$path != "source", ]$path)

# Place "source" first
nodes$path <- factor(nodes$path, levels = c("source", paths))

# Line colors
src_up_col <- "red"
src_dn_col <- "blue"

# Target node colors
tar_up_col <- "red"
tar_dn_col <- "blue"
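
These four colors are matched by name to the labels stored in source_type (links) and type (nodes) when the color scale is built. A minimal base-R sketch of that name-based lookup, using the src_up, src_dn, tar_up and tar_dn labels from the example data (type_cols and labels_demo are illustrative names):

```r
src_up_col <- "red";  src_dn_col <- "blue"
tar_up_col <- "red";  tar_dn_col <- "blue"

# Named vector: names are the labels found in the data, values are colors
type_cols <- c(
  src_up = src_up_col, src_dn = src_dn_col,
  tar_up = tar_up_col, tar_dn = tar_dn_col
)

# Look up the colors of a few example labels by name
labels_demo <- c("tar_dn", "src_up", "tar_up")
cols_demo <- unname(type_cols[labels_demo])
print(cols_demo)
```

A label that is missing from the vector names gets no color, so the names must match the values in the data exactly.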

Plotting

1. Quick preview

IMPORTANT! The nodes and edges tables MUST NOT contain columns named ‘node’, ‘cross’, ‘node.type’, ‘x’, ‘y’, or ‘degree’.
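
One way to guard against this before building the layout is to compare the column names of each table with the reserved list. The helper below, find_reserved(), is a hypothetical convenience function written for this sketch; it is not part of crosslink.

```r
reserved_cols <- c("node", "cross", "node.type", "x", "y", "degree")

# Return the reserved names that appear among a table's columns
find_reserved <- function(df, reserved = reserved_cols) {
  intersect(colnames(df), reserved)
}

ok_tbl  <- data.frame(key = "geneA", path = "path1")
bad_tbl <- data.frame(key = "geneA", x = 1, degree = 3)

ok_hits  <- find_reserved(ok_tbl)   # no reserved names
bad_hits <- find_reserved(bad_tbl)  # finds "x" and "degree"
print(bad_hits)
```

If the result is non-empty, rename the offending columns in your own tables before passing them on.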

# Create the network layout object with crosslink()
toy <- crosslink(
  nodes = nodes,
  edges = links,
  cross.by = "path",
  xrange = c(0, 10),
  yrange = c(-5, 5),
  spaces = "partition")

# Plot the network layout
cl_plot(toy)
## Warning: `aes_string()` was deprecated in ggplot2 3.0.0.
## ℹ Please use tidy evaluation idioms with `aes()`.
## ℹ See also `vignette("ggplot2-in-packages")` for more information.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.

2. Arrange the targets of each pathway in a circle

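# Custom function: place the nodes of each cross (pathway) on its own circle
# x, y: node coordinates; rx, ry: circle radii
# intensity: amplitude of the sinusoidal x offset of the circle centers
toCircle <- function(x, y, rx = 1, ry = 1, intensity = 2){
  # Rescale x together with 0 onto [0, 2*pi], then drop the added 0
  mapTo2pi <- function(x) {scales::rescale(c(0, x), to = c(0, 2*pi))[-1]}
  data.frame(x, y) %>%
    # Nodes sharing the same x form one group
    mutate(group = paste0("group", x)) %>%
    # Circle centers: spread the groups along y ...
    mutate(yy = scales::rescale(-x, to = range(y))) %>%
    # ... and stagger them in x along a sine wave
    mutate(xx = mean(x) + intensity * sin(yy %>% mapTo2pi)) %>%
    group_by(group) %>%
    # Angle of each node on its circle, from its rank within the group
    mutate(tri = rank(y, ties.method = "first") %>% mapTo2pi)  %>%
    ungroup() %$%
    data.frame(
      x = xx + rx*sin(tri),
      y = yy + ry*cos(tri))
}

# Apply the circular transformation to the network layout object
toy_circle <- toy %>% tf_fun(
  crosses = paths,
  along = "xy",
  fun = toCircle,
  rx = 0.2, ry = 0.2)

# Plot the transformed network (without labels)
toy_circle %>% cl_plot(label = NA)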
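
The core of toCircle is the rank-to-angle mapping: the k-th of n nodes in a group gets the angle 2*pi*k/n, and its position is the circle center plus (radius*sin(angle), radius*cos(angle)). The base-R sketch below isolates that step for a single circle; rank_to_angle() and place_on_circle() are hypothetical names, and the sketch avoids the scales package by assuming ranks are positive integers.

```r
# Map ranks 1..n to angles in (0, 2*pi]; for positive ranks this matches
# rescaling c(0, ranks) to c(0, 2*pi) and dropping the leading 0
rank_to_angle <- function(r) 2 * pi * r / max(r)

# Place n points evenly on a circle centered at (cx, cy)
place_on_circle <- function(n, cx = 0, cy = 0, radius = 0.2) {
  angle <- rank_to_angle(seq_len(n))
  data.frame(
    x = cx + radius * sin(angle),
    y = cy + radius * cos(angle)
  )
}

pts <- place_on_circle(8, cx = 1, cy = 2)
dist_to_center <- sqrt((pts$x - 1)^2 + (pts$y - 2)^2)
print(round(dist_to_center, 3))
```

Because the angle depends only on the rank, pathways with more targets simply get more closely spaced points on a circle of the same radius.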

3. Post-processing

# Geometric transformations of the circular layout
toy_final <- toy_circle %>%
  tf_rotate(angle = -90) %>%
  tf_flip(axis = "x", crosses = paths) %>%
  tf_shift(y = 8, crosses = paths, relative = FALSE) %>%
  set_header()

# Visualize the post-processed layout
toy_final %>% cl_plot(label = NA) %>% cl_void()
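
To see what a rotation and a mirror do to plain coordinates, here is a generic base-R sketch. It uses the standard counterclockwise-positive rotation about the origin and a mirror about the midpoint of the values; whether tf_rotate and tf_flip use the same sign convention and pivot is an assumption, and rotate_xy() and flip_y() are hypothetical helpers, not crosslink functions.

```r
# Rotate points counterclockwise by `angle` degrees around the origin
rotate_xy <- function(x, y, angle) {
  theta <- angle * pi / 180
  data.frame(
    x = x * cos(theta) - y * sin(theta),
    y = x * sin(theta) + y * cos(theta)
  )
}

# Mirror y values around the midpoint of their range
flip_y <- function(y) sum(range(y)) - y

pts <- data.frame(x = c(1, 2, 3), y = c(0, 0, 1))

# A negative angle gives a clockwise quarter turn: (x, y) becomes (y, -x)
rot <- rotate_xy(pts$x, pts$y, angle = -90)
rot$y <- flip_y(rot$y)
print(round(rot, 3))
```

Checking the transformations on three or four points like this can help when choosing the angle and axis for your own layout.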

4. Fine-tuning

# Show the metadata columns available as aesthetics
show_aes(toy_final)
## Available meta.data names are showing below.
## Cross: node, node.type, x, y, cross, key, type, path, signif, degree 
## Link: src, tar, source_type, src.cross, tar.cross, source, target, src.degree, tar.degree, x, y, xend, yend 
## Header: node, node.type, x, y, cross, header
p <- ggplot() +
  # Each layer is independent; reorder the layers to change what is drawn on top

  # Black circles for pathways (bottom layer, partly covered by the target points)
  # r matches the rx and ry passed to tf_fun() above
  ggforce::geom_circle(
    mapping = aes(x0 = x0, y0 = y0, r = r),
    data = get_cross(toy_final) %>% filter(cross != "source") %>%
      group_by(path) %>%
      transmute(
        x0 = mean(x),
        y0 = mean(y),
        r = 0.2
      ) %>% unique(),
    show.legend = FALSE
  ) +

  # Links (miRNA-target relationships)
  geom_segment(
    mapping = aes(x, y, xend = xend, yend = yend, color = source_type),
    data = get_link(toy_final),
    alpha = 0.3
  ) +

  # Target nodes
  geom_point(
    mapping = aes(x, y,
                  # size = size,
                  color = type),
    data = get_cross(toy_final) %>% filter(cross != "source")
  ) +

  # Text: pathway name of each target circle
  ggrepel::geom_text_repel(
    mapping = aes(x, y, label = header), nudge_y = 0.3,
    data = get_header(toy_final) %>% filter(cross != "source"),
    segment.color = NA
  ) +

  # Text: miRNA source node names
  geom_text(
    mapping = aes(x, y, label = key), angle = 90, hjust = 1, nudge_y = -0.1,
    data = get_cross(toy_final) %>% filter(cross == "source")
  ) +

  # Text: number of targets in each pathway
  geom_text(
    mapping = aes(x, y, label = num),
    data = get_cross(toy_final) %>% filter(cross != "source") %>%
      group_by(path) %>%
      transmute(
        x = mean(x),
        y = mean(y),
        num = n()
      ) %>% unique()
  ) +

  # Color settings
  scale_color_manual(values = c(
      src_up = src_up_col, src_dn = src_dn_col,
      tar_up = tar_up_col, tar_dn = tar_dn_col)) +
  labs(x = NULL, y = "Target_Pathway") +
  scale_y_continuous(expand = expansion(mult = c(0.25, 0.1)))

p
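
The circle layer and the count layer both reduce each pathway to a single row (its center and its number of targets) with transmute() followed by unique(). The same reduction can be written with summarise(), which returns one row per group directly. A sketch on invented coordinates (toy_nodes is illustrative data, not output of get_cross()):

```r
library(dplyr)

# Invented node coordinates for two pathways
toy_nodes <- data.frame(
  path = c("path1", "path1", "path1", "path2", "path2"),
  x = c(0, 1, 2, 10, 12),
  y = c(0, 3, 0, 5, 7)
)

# One row per pathway: circle center and number of targets
centers <- toy_nodes %>%
  group_by(path) %>%
  summarise(x0 = mean(x), y0 = mean(y), num = n(), .groups = "drop")

print(centers)
```

Either form feeds the same columns to the plotting layers; summarise() just makes the one-row-per-pathway intent explicit.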

If you also want to draw points for the source nodes, as in the second example figure, run the following code.

# Plot source nodes
p <- p + geom_point(
  mapping = aes(x, y),
  data = get_cross(toy_final) %>% filter(cross == "source")
  )

5. Add annotation plots

Place the ‘signif’ label of each source below its name.

# Create the circLink plot with annotations
cl_plot2(
  p %>% cl_void(th = theme(
    axis.title = element_text())),
  object = toy_final, 
  annotation = cl_annotation(
    bottom = ggplot() +
      geom_text(
        mapping = aes(seq_along(key), 0, label = signif), 
        data = nodes %>% filter(path == "source")
      ) + theme_void() 
    ,
    bottom.by = "source", bottom.height = 0.05
  )
)

# Save the plot as a PDF file
ggsave("circLink.pdf", width = 10, height = 5)

Appendix: Example Data Generation Process

# Generate node names
sources <- paste0("source", 1:20 %>% format)
targets <- paste0("target", 1:500 %>% format)
paths <- paste0("path", 1:15 %>% format)

# Create the node data frame
# Note: these sample() calls run before set.seed() below, so the node types
# and signif labels are not reproducible between runs
nodes <- data.frame(
  key = c(sources, targets),
  type = c(rep("src_up", length(sources)/2),
           rep("src_dn", length(sources)/2),
           sample(c("tar_up", "tar_dn"), length(targets), replace = TRUE)),
  path = c(rep("source", length(sources)),
           rep(paths, times = c(
             40, 50, 30, 30, 50, 50, 20, 30, 30, 40, 20, 30, 30, 20, 30
           ))) %>% factor(
             levels = c("source", paths)
           ),
  signif = c(sample(c("*", "**", "***", "ns"), length(sources), replace = TRUE),
             rep(NA, length(targets)))
)

# Generate the links; unique() drops duplicated source-target pairs
link_n <- 500
set.seed(666)
links <- data.frame(
  src = sample(sources, link_n, replace = TRUE),
  tar = sample(targets, link_n, replace = TRUE)) %>%
  unique() %>%
  mutate(source_type = nodes$type[match(src, nodes$key)])

# Save the example data files
write.csv(links, "easy_input_links.csv", row.names = FALSE, quote = FALSE)
write.csv(nodes, "easy_input_nodes.csv", row.names = FALSE, quote = FALSE)

Session Info

sessionInfo()
## R version 4.5.1 (2025-06-13)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.3 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so;  LAPACK version 3.12.0
## 
## locale:
##  [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8       
##  [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
##  [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C          
## [10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   
## 
## time zone: UTC
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] patchwork_1.3.2 rlang_1.1.6     lubridate_1.9.4 forcats_1.0.0  
##  [5] stringr_1.5.2   dplyr_1.1.4     purrr_1.1.0     readr_2.1.5    
##  [9] tidyr_1.3.1     tibble_3.3.0    ggplot2_4.0.0   tidyverse_2.0.0
## [13] magrittr_2.0.4 
## 
## loaded via a namespace (and not attached):
##  [1] yulab.utils_0.2.1  rappdirs_0.3.3     sass_0.4.10        generics_0.1.4    
##  [5] ggplotify_0.1.3    stringi_1.8.7      hms_1.1.3          digest_0.6.37     
##  [9] evaluate_1.0.5     grid_4.5.1         timechange_0.3.0   RColorBrewer_1.1-3
## [13] fastmap_1.2.0      jsonlite_2.0.0     ggrepel_0.9.6      aplot_0.2.9       
## [17] scales_1.4.0       tweenr_2.0.3       textshaping_1.0.3  jquerylib_0.1.4   
## [21] cli_3.6.5          crayon_1.5.3       polyclip_1.10-7    withr_3.0.2       
## [25] cachem_1.1.0       yaml_2.3.10        tools_4.5.1        tzdb_0.5.0        
## [29] gridGraphics_0.5-1 vctrs_0.6.5        R6_2.6.1           lifecycle_1.0.4   
## [33] ggfun_0.2.0        fs_1.6.6           MASS_7.3-65        ragg_1.5.0        
## [37] pkgconfig_2.0.3    pillar_1.11.1      bslib_0.9.0        gtable_0.3.6      
## [41] glue_1.8.0         Rcpp_1.1.0         systemfonts_1.2.3  ggforce_0.5.0     
## [45] xfun_0.53          tidyselect_1.2.1   knitr_1.50         farver_2.1.2      
## [49] htmltools_0.5.8.1  rmarkdown_2.29     labeling_0.4.3     compiler_4.5.1    
## [53] S7_0.2.0